fl framework
BlendFL: Blended Federated Learning for Handling Multimodal Data Heterogeneity
Guerra-Manzanares, Alejandro, El-Herraoui, Omar, Maniatakos, Michail, Shamout, Farah E.
One of the key challenges of collaborative machine learning, without data sharing, is multimodal data heterogeneity in real-world settings. While Federated Learning (FL) enables model training across multiple clients, existing frameworks, such as horizontal and vertical FL, are only effective in `ideal' settings that meet specific assumptions. Hence, they struggle to address scenarios where neither all modalities nor all samples are represented across the participating clients. To address this gap, we propose BlendFL, a novel FL framework that seamlessly blends the principles of horizontal and vertical FL in a synchronized and non-restrictive fashion despite the asymmetry across clients. Specifically, any client within BlendFL can benefit from either of the approaches, or both simultaneously, according to its available dataset. In addition, BlendFL features a decentralized inference mechanism, empowering clients to run collaboratively trained local models using available local data, thereby reducing latency and reliance on central servers for inference. We also introduce BlendAvg, an adaptive global model aggregation strategy that prioritizes collaborative model updates based on each client's performance. We trained and evaluated BlendFL and other state-of-the-art baselines on three classification tasks using a large-scale real-world multimodal medical dataset and a popular multimodal benchmark. Our results highlight BlendFL's superior performance for both multimodal and unimodal classification. Ablation studies demonstrate BlendFL's faster convergence compared to traditional approaches, accelerating collaborative learning. Overall, in our study we highlight the potential of BlendFL for handling multimodal data heterogeneity for collaborative learning in real-world settings where data privacy is crucial, such as in healthcare and finance.
- Information Technology > Security & Privacy (1.00)
- Health & Medicine > Diagnostic Medicine > Imaging (0.46)
FedAgentBench: Towards Automating Real-world Federated Medical Image Analysis with Server-Client LLM Agents
Saha, Pramit, Strong, Joshua, Mishra, Divyanshu, Ouyang, Cheng, Noble, J. Alison
Federated learning (FL) allows collaborative model training across healthcare sites without sharing sensitive patient data. However, real-world FL deployment is often hindered by complex operational challenges that demand substantial human efforts. This includes: (a) selecting appropriate clients (hospitals), (b) coordinating between the central server and clients, (c) client-level data pre-processing, (d) harmonizing non-standardized data and labels across clients, and (e) selecting FL algorithms based on user instructions and cross-client data characteristics. However, the existing FL works overlook these practical orchestration challenges. These operational bottlenecks motivate the need for autonomous, agent-driven FL systems, where intelligent agents at each hospital client and the central server agent collaboratively manage FL setup and model training with minimal human intervention. To this end, we first introduce an agent-driven FL framework that captures key phases of real-world FL workflows from client selection to training completion and a benchmark dubbed FedAgentBench that evaluates the ability of LLM agents to autonomously coordinate healthcare FL. Our framework incorporates 40 FL algorithms, each tailored to address diverse task-specific requirements and cross-client characteristics. Furthermore, we introduce a diverse set of complex tasks across 201 carefully curated datasets, simulating 6 modality-specific real-world healthcare environments, viz., Dermatoscopy, Ultrasound, Fundus, Histopathology, MRI, and X-Ray. We assess the agentic performance of 14 open-source and 10 proprietary LLMs spanning small, medium, and large model scales. While some agent cores such as GPT-4.1 and DeepSeek V3 can automate various stages of the FL pipeline, our results reveal that more complex, interdependent tasks based on implicit goals remain challenging for even the strongest models.
- North America > United States > Virginia (0.04)
- Europe > United Kingdom > England > Oxfordshire > Oxford (0.04)
- Asia > China > Hong Kong (0.04)
- Workflow (0.90)
- Research Report > New Finding (0.34)
- Health & Medicine > Therapeutic Area > Oncology (0.94)
- Health & Medicine > Diagnostic Medicine (0.69)
Federated learning framework for collaborative remaining useful life prognostics: an aircraft engine case study
Landau, Diogo, de Pater, Ingeborg, Mitici, Mihaela, Saurabh, Nishant
Complex systems such as aircraft engines are continuously monitored by sensors. In predictive aircraft maintenance, the collected sensor measurements are used to estimate the health condition and the Remaining Useful Life (RUL) of such systems. However, a major challenge when developing prognostics is the limited number of run-to-failure data samples. This challenge could be overcome if multiple airlines would share their run-to-failure data samples such that su fficient learning can be achieved. Due to privacy concerns, however, airlines are reluctant to share their data in a centralized setting. In this paper, a collaborative federated learning framework is therefore developed instead. Here, several airlines cooperate to train a collective RUL prognostic machine learning model, without the need to centrally share their data. For this, a decentralized validation procedure is proposed to validate the prognostics model without sharing any data. Moreover, sensor data is often noisy and of low quality. This paper therefore proposes four novel methods to aggregate the parameters of the global prognostic model. These methods enhance the robustness of the FL framework against noisy data. The proposed framework is illustrated for training a collaborative RUL prognostic model for aircraft engines, using the N-CMAPSS dataset. Here, six airlines are considered, that collaborate in the FL framework to train a collective RUL prognostic model for their aircraft's engines. When comparing the proposed FL framework with the case where each airline independently develops their own prognostic model, the results show that FL leads to more accurate RUL prognostics for five out of the six airlines. Moreover, the novel robust aggregation methods render the FL framework robust to noisy data samples. Keywords: Federated and collaborative learning; remaining useful life prognostics; robust parameter aggregation method; distributed data; N-CMAPSS turbofan engines1. Introduction The emergence of distributed computing technologies integrated with machine learning techniques, has gained traction for distributed data processing across various application domains.
- North America > United States (0.28)
- Europe > Netherlands > South Holland > Delft (0.04)
- Asia > Singapore (0.04)
- Asia > China > Jiangsu Province > Nanjing (0.04)
- Transportation > Air (1.00)
- Information Technology > Security & Privacy (1.00)
- Aerospace & Defense > Aircraft (1.00)
- (2 more...)
Federated prediction for scalable and privacy-preserved knowledge-based planning in radiotherapy
Chen, Jingyun, Horowitz, David, Yuan, Yading
Although aggregating data from different institutions could alleviate this problem, data sharing is a practical challenge due to concerns about patient data privacy and other technical obstacles. Purpose: This work aims to address this dilemma by developing FedKBP+, a comprehensive federated learning (FL) platform for predictive tasks in real-world applications in radiotherapy treatment planning. Methods: We implemented a unified communication stack based on Google Remote Procedure Call (gRPC) to support communication between participants whether located on the same workstation or distributed across multiple workstations. In addition to supporting the centralized FL strategies commonly available in existing open-source frameworks, FedKBP+ also provides a fully decentralized FL model where participants directly exchange model weights to each other through Peer-to-Peer communication. We evaluated FedKBP+ on three predictive tasks using scale-attention network (SA-Net) as the predictive model. Results: Using 340 cases (training: 200; validation: 40; testing: 100) from the OpenKBP Challenge, a 3D dose prediction model trained with FedAvg algorithm outperformed the model trained on the local data, and achieved predictive accuracy comparable to that of a centrally trained model using pooled data in both independent and identically distributed (IID) and non-IID settings. We further evaluated the performance of FedKBP+ against NVFlare on the task of brain tumor segmentation using 227 cases from eight sites in the 2021 BraTS challenge dataset (training: 152; validation: 27; testing: 48). FedKBP+ surpassed the NVFlare framework in both accuracy and training efficiency.
- North America > United States > New York > New York County > New York City (0.04)
- North America > United States > California > Santa Clara County > Palo Alto (0.04)
- Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)
- Information Technology > Security & Privacy (1.00)
- Health & Medicine > Therapeutic Area > Oncology (1.00)
- Health & Medicine > Nuclear Medicine (1.00)
FedPaI: Achieving Extreme Sparsity in Federated Learning via Pruning at Initialization
Wang, Haonan, Liu, Zeli, Hoshino, Kajimusugura, Zhang, Tuo, Walters, John Paul, Crago, Stephen
--Federated Learning (FL) enables distributed training on edge devices but faces significant challenges due to resource constraints in edge environments, impacting both communication and computational efficiency. Existing iterative pruning techniques improve communication efficiency but are limited by their centralized design, which struggles with FL's decentralized and data-imbalanced nature, resulting in suboptimal sparsity levels. T o address these issues, we propose FedPaI, a novel efficient FL framework that leverages Pruning at Initialization (PaI) to achieve extreme sparsity. FedPaI identifies optimal sparse connections at an early stage, maximizing model capacity and significantly reducing communication and computation overhead by fixing sparsity patterns at the start of training. T o adapt to diverse hardware and software environments, FedPaI supports both structured and unstructured pruning. Additionally, we introduce personalized client-side pruning mechanisms for improved learning capacity and sparsity-aware server-side aggregation for enhanced efficiency. Experimental results demonstrate that FedPaI consistently outperforms existing efficient FL that applies conventional iterative pruning with significant leading in efficiency and model accuracy. For the first time, our proposed FedPaI achieves an extreme sparsity level of up to 98% without compromising the model accuracy compared to unpruned baselines, even under challenging non-IID settings. By employing our FedPaI with joint optimization of model learning capacity and sparsity, FL applications can benefit from faster convergence and accelerate the training by 6.4 to 7.9 . Federated Learning (FL) [1], [2] has emerged as a promising approach for decentralized machine learning on edge devices, which are rapidly growing in number and capability.
- North America > United States > California > Los Angeles County > Los Angeles (0.29)
- North America > United States > California > Monterey County > Marina (0.04)
- North America > Canada > Ontario > Toronto (0.04)
- Education (0.68)
- Information Technology > Security & Privacy (0.46)
FLStore: Efficient Federated Learning Storage for non-training workloads
Khan, Ahmad Faraz, Fountain, Samuel, Abdelmoniem, Ahmed M., Butt, Ali R., Anwar, Ali
Federated Learning (FL) is an approach for privacy-preserving Machine Learning (ML), enabling model training across multiple clients without centralized data collection. With an aggregator server coordinating training, aggregating model updates, and storing metadata across rounds. In addition to training, a substantial part of FL systems are the non-training workloads such as scheduling, personalization, clustering, debugging, and incentivization. Most existing systems rely on the aggregator to handle non-training workloads and use cloud services for data storage. This results in high latency and increased costs as non-training workloads rely on large volumes of metadata, including weight parameters from client updates, hyperparameters, and aggregated updates across rounds, making the situation even worse. We propose FLStore, a serverless framework for efficient FL non-training workloads and storage. FLStore unifies the data and compute planes on a serverless cache, enabling locality-aware execution via tailored caching policies to reduce latency and costs. Per our evaluations, compared to cloud object store based aggregator server FLStore reduces per request average latency by 71% and costs by 92.45%, with peak improvements of 99.7% and 98.8%, respectively. Compared to an in-memory cloud cache based aggregator server, FLStore reduces average latency by 64.6% and costs by 98.83%, with peak improvements of 98.8% and 99.6%, respectively. FLStore integrates seamlessly with existing FL frameworks with minimal modifications, while also being fault-tolerant and highly scalable.
- North America > Canada > Ontario > Toronto (0.14)
- North America > United States > Minnesota (0.04)
- North America > United States > California > Santa Clara County > Santa Clara (0.04)
- (3 more...)
A new framework for prognostics in decentralized industries: Enhancing fairness, security, and transparency through Blockchain and Federated Learning
Pham, T. Q. D., Tran, K. D., Nguyen, Khanh T. P., Tran, X. V., Tran, K. P.
As global industries transition towards Industry 5.0 predictive maintenance PM remains crucial for cost effective operations resilience and minimizing downtime in increasingly smart manufacturing environments In this chapter we explore how the integration of Federated Learning FL and blockchain BC technologies enhances the prediction of machinerys Remaining Useful Life RUL within decentralized and human centric industrial ecosystems Traditional centralized data approaches raise concerns over privacy security and scalability especially as Artificial intelligence AI driven smart manufacturing becomes more prevalent This chapter leverages FL to enable localized model training across multiple sites while utilizing BC to ensure trust transparency and data integrity across the network This BC integrated FL framework optimizes RUL predictions enhances data privacy and security establishes transparency and promotes collaboration in decentralized manufacturing It addresses key challenges such as maintaining privacy and security ensuring transparency and fairness and incentivizing participation in decentralized networks Experimental validation using the NASA CMAPSS dataset demonstrates the model effectiveness in real world scenarios and we extend our findings to the broader research community through open source code on GitHub inviting collaborative development to drive innovation in Industry 5.0
- North America > United States (0.34)
- Europe > France > Hauts-de-France > Nord > Lille (0.04)
- Europe > Belgium (0.04)
- (3 more...)
Communication-Efficient and Privacy-Adaptable Mechanism for Federated Learning
Ling, Chih Wei, Wu, Youqi, Sun, Jiande, Li, Cheuk Ting, Song, Linqi, Xu, Weitao
Training machine learning models on decentralized private data via federated learning (FL) poses two key challenges: communication efficiency and privacy protection. In this work, we address these challenges within the trusted aggregator model by introducing a novel approach called the Communication-Efficient and Privacy-Adaptable Mechanism (CEPAM), achieving both objectives simultaneously. In particular, CEPAM leverages the rejection-sampled universal quantizer (RSUQ), a construction of randomized vector quantizer whose resulting distortion is equivalent to a prescribed noise, such as Gaussian or Laplace noise, enabling joint differential privacy and compression. Moreover, we analyze the trade-offs among user privacy, global utility, and transmission rate of CEPAM by defining appropriate metrics for FL with differential privacy and compression. Our CEPAM provides the additional benefit of privacy adaptability, allowing clients and the server to customize privacy protection based on required accuracy and protection. We assess CEPAM's utility performance using MNIST dataset, demonstrating that CEPAM surpasses baseline models in terms of learning accuracy.
- Asia > China > Hong Kong (0.04)
- North America > United States > New York > New York County > New York City (0.04)
- Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)
- Asia > China > Guangdong Province > Shenzhen (0.04)
Fed-CO _{2} : Cooperation of Online and Offline Models for Severe Data Heterogeneity in Federated Learning
Federated Learning (FL) has emerged as a promising distributed learning paradigm that enables multiple clients to learn a global model collaboratively without sharing their private data. However, the effectiveness of FL is highly dependent on the quality of the data that is being used for training. In particular, data heterogeneity issues, such as label distribution skew and feature skew, can significantly impact the performance of FL. Previous studies in FL have primarily focused on addressing label distribution skew data heterogeneity, while only a few recent works have made initial progress in tackling feature skew issues. Notably, these two forms of data heterogeneity have been studied separately and have not been well explored within a unified FL framework.
ExclaveFL: Providing Transparency to Federated Learning using Exclaves
Guo, Jinnan, Vaswani, Kapil, Paverd, Andrew, Pietzuch, Peter
In federated learning (FL), data providers jointly train a model without disclosing their training data. Despite its privacy benefits, a malicious data provider can simply deviate from the correct training protocol without being detected, thus attacking the trained model. While current solutions have explored the use of trusted execution environment (TEEs) to combat such attacks, there is a mismatch with the security needs of FL: TEEs offer confidentiality guarantees, which are unnecessary for FL and make them vulnerable to side-channel attacks, and focus on coarse-grained attestation, which does not capture the execution of FL training. We describe ExclaveFL, an FL platform that achieves end-to-end transparency and integrity for detecting attacks. ExclaveFL achieves this by employing a new hardware security abstraction, exclaves, which focus on integrity-only guarantees. ExclaveFL uses exclaves to protect the execution of FL tasks, while generating signed statements containing fine-grained, hardware-based attestation reports of task execution at runtime. ExclaveFL then enables auditing using these statements to construct an attested dataflow graph and then check that the FL training jobs satisfies claims, such as the absence of attacks. Our experiments show that ExclaveFL introduces a less than 9% overhead while detecting a wide-range of attacks.
- North America > United States > New York > New York County > New York City (0.04)
- South America > Peru > Lima Department > Lima Province > Lima (0.04)
- South America > Chile > Santiago Metropolitan Region > Santiago Province > Santiago (0.04)
- (4 more...)